Placing Tracks in Context

In this portfolio I will examine how AI-generated dance music compares to human-made music, using a data set of both AI-generated and human-made tracks from the course Computational Musicology. The tracks are part of a collection (corpus) of music that is either composed by students of Computational Musicology, generated by AI, or taken from existing royalty-free music. The features in the table below, along with their values, were retrieved with Essentia, an open-source C++ library for audio analysis and audio-based music information retrieval; all the tracks in the table were analysed with this library to produce these results. The second table is the same dataset filtered down to the songs I considered to be EDM, so that I can compare only electronic dance tracks. Here is an explanation of what all the features mean:

These features together provide a clear overview of each track’s musical profile, making it easier to analyze and compare songs.

filename approachability arousal danceability engagingness instrumentalness tempo valence ai
ahram-j-1 0.2991498 3.417260 0.2711799 0.1026429 0.9141049 84 4.016967 TRUE
ahram-j-2 0.1889460 4.459196 0.4690239 0.5624804 0.3271964 95 3.767471 TRUE
aleksandra-b-1 0.1644350 5.343031 0.8357580 0.5665221 0.3702452 68 4.738314 FALSE
aleksandra-b-2 0.2511401 3.680455 0.6918470 0.1301249 0.8842366 104 4.044941 TRUE
angelo-w-1 0.1614367 3.621579 0.7069914 0.3248783 0.7907066 140 3.301473 FALSE
id approachability arousal danceability engagingness instrumentalness tempo valence ai
berend-b-1 0.1450785 5.021568 0.7396224 0.5278043 0.5858963 143 4.429538 TRUE
berend-b-2 0.2117881 5.656832 0.6107739 0.5786535 0.3487158 75 4.476577 TRUE
desmond-l-1 0.2629817 4.478108 0.2859525 0.4156072 0.6434987 135 3.936315 TRUE
desmond-l-2 0.2929443 5.076702 0.3010519 0.5524329 0.4989389 73 4.316221 TRUE
evan-l-2 0.1081999 5.602334 0.4800247 0.6272448 0.5513844 135 4.445124 TRUE
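The EDM filter behind the second table was done by ear rather than by a feature threshold. A minimal Python sketch of that step, using a few rows copied from the first table and a hypothetical hand-picked EDM list (not my actual selection), might look like this:

```python
# One dict per track, mirroring a few columns of the Essentia feature table.
tracks = [
    {"filename": "ahram-j-1", "danceability": 0.2711799, "tempo": 84, "ai": True},
    {"filename": "aleksandra-b-1", "danceability": 0.8357580, "tempo": 68, "ai": False},
    {"filename": "berend-b-1", "danceability": 0.7396224, "tempo": 143, "ai": True},
]

# Hand-picked by listening, not derived from the features
# (hypothetical example set, for illustration only).
edm_tracks = {"aleksandra-b-1", "berend-b-1"}

# Keep only the rows judged to be EDM.
edm_subset = [t for t in tracks if t["filename"] in edm_tracks]
```

Keeping the selection as an explicit set of filenames makes the subjective genre judgement reproducible: anyone rerunning the analysis filters out exactly the same rows.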

Information on my submitted tracks

Hidde-s-1:

I produced this song myself. I make music with clubs or festivals in mind, as I like to DJ. For this track I tried to combine a mainstream house music sound with some rawer electronic sounds.

Hidde-s-2:

This is a track I generated with Suno. I asked ChatGPT for the key characteristics of a dance track in a sweaty club in Amsterdam and used its answer as the prompt: “Punchy four-on-the-floor kick, deep rolling bass, crisp shuffled hi-hats, sharp claps, detuned wide synth leads, tension-filled breakdown, rising FX, massive sidechained drop, high-energy, club-focused groove.”

My tracks in the class corpus

This graph maps the engagingness of each song against its danceability, with the colour scale based on each song's tempo. The first noticeable aspect of the graph is the apparent positive correlation between danceability and engagingness, shown by the red trend line: in most cases a song with a high danceability value will have a high engagingness rating as well. From the colour scaling it can also be seen that most songs with high scores for those features also have a higher tempo. This could mean that these features are highly correlated, or that the way Essentia measures them is similar in terms of computational analysis. It would be interesting to look at why this correlation seems to hold, for instance by examining the role of instrumentalness, or of genre, in combination with this analysis.

The two highlighted points are a song I arranged myself and one I generated with Suno. My own song scores higher in both danceability and engagingness than the AI song, even though they are the same genre and were made with the same intention. We can't conclude a lot from this single example, but it leads us to the hypothesis that AI-generated music is distinguishable from human-made music, which we will evaluate in the next tabs.

AI- and human-generated music are indistinguishable through clustering